##   X Index               Title                 Artist      TopGenre Year
## 1 1  1794           Sacrifice                  Anouk   dutch indie 1998
## 2 2   774    Hou Vol Hou Vast                   BLØF     dutch pop 2018
## 3 3   625 Three Days In A Row                  Anouk   dutch indie 2015
## 4 4   412    Peter Gunn Theme Emerson, Lake & Palmer    album rock 2010
## 5 5   606     Het Dorp - Live          Wim Sonneveld dutch cabaret 2015
## 6 6   754         Malle Babbe            Rob De Nijs     dutch pop 2018
##   Beats.Per.Minute..BPM. Energy Danceability Loudness..dB. Liveness Valence
## 1                    136     17           54           -13        9      24
## 2                     86     61           51            -5        8      23
## 3                    171     50           36            -6       16      39
## 4                    131     83           43            -7       92      71
## 5                    114     44           37           -15       67      45
## 6                     87     38           35           -10       12      53
##   Length..Duration. Acousticness Speechiness Popularity      Abbrev
## 1               238           74           5         11 alternative
## 2               295            0           2         12         pop
## 3               254            0           3         13 alternative
## 4               217            1           3         14        rock
## 5               198           82           8         15       other
## 6               253           73           4         15         pop

Exploratory Data Analysis

##                          BPM      Energy Danceability          dB    Liveness
## BPM             1.0000000000  0.15664444  -0.14060233  0.09292650  0.01625639
## Energy          0.1566444353  1.00000000   0.13961627  0.73571088  0.17411770
## Danceability   -0.1406023300  0.13961627   1.00000000  0.04423531 -0.10306258
## dB              0.0929265007  0.73571088   0.04423531  1.00000000  0.09825705
## Liveness        0.0162563857  0.17411770  -0.10306258  0.09825705  1.00000000
## Valence         0.0596532230  0.40517478   0.51456376  0.14704112  0.05066664
## Duration        0.0062516715  0.02280040  -0.13543160 -0.05612653  0.03249854
## Acousticness   -0.1224718133 -0.66515636  -0.13576888 -0.45163499 -0.04620551
## Speechiness     0.0855982110  0.20586499   0.12522900  0.12508975  0.09259447
## popularity_cat -0.0005116562  0.07732221   0.10725015  0.12941612 -0.08073505
## genre_cat_Num  -0.0093962893  0.01656957   0.07082700  0.07949270  0.03985214
##                    Valence     Duration Acousticness Speechiness popularity_cat
## BPM             0.05965322  0.006251671 -0.122471813  0.08559821  -0.0005116562
## Energy          0.40517478  0.022800396 -0.665156355  0.20586499   0.0773222082
## Danceability    0.51456376 -0.135431600 -0.135768879  0.12522900   0.1072501519
## dB              0.14704112 -0.056126527 -0.451634993  0.12508975   0.1294161168
## Liveness        0.05066664  0.032498536 -0.046205511  0.09259447  -0.0807350510
## Valence         1.00000000 -0.203689536 -0.239729075  0.10710188   0.0987226169
## Duration       -0.20368954  1.000000000 -0.102318918 -0.02782584  -0.0619802162
## Acousticness   -0.23972907 -0.102318918  1.000000000 -0.09825610  -0.0573931780
## Speechiness     0.10710188 -0.027825837 -0.098256101  1.00000000   0.0829758481
## popularity_cat  0.09872262 -0.061980216 -0.057393178  0.08297585   1.0000000000
## genre_cat_Num  -0.06341676 -0.072497611 -0.006356929  0.10494913  -0.0567017949
##                genre_cat_Num
## BPM             -0.009396289
## Energy           0.016569567
## Danceability     0.070827001
## dB               0.079492696
## Liveness         0.039852139
## Valence         -0.063416760
## Duration        -0.072497611
## Acousticness    -0.006356929
## Speechiness      0.104949133
## popularity_cat  -0.056701795
## genre_cat_Num    1.000000000
##           BPM        Energy  Danceability            dB      Liveness 
##      1.074835      4.102349      1.511429      2.432290      1.072964 
##       Valence      Duration  Acousticness   Speechiness genre_cat_Num 
##      1.851642      1.110300      1.870871      1.083343      1.047772
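
The ten values printed above are not labelled in the source. They look like variance inflation factors (VIFs) for the ten predictors, but that reading is an assumption. Under that assumption, the following minimal R sketch shows two equivalent ways to obtain a VIF. It uses synthetic data, and the variables `x1`, `x2`, `x3` are hypothetical, not columns of the dataset.

```r
# Synthetic predictors (hypothetical; not the Spotify data).
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.6)  # deliberately correlated with x1
x3 <- rnorm(n)
X  <- cbind(x1, x2, x3)

# Route 1: VIFs are the diagonal of the inverse predictor correlation matrix.
vif_from_cor <- diag(solve(cor(X)))

# Route 2: VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
# predictor j on all the other predictors.
r2_x1       <- summary(lm(x1 ~ x2 + x3))$r.squared
vif_from_lm <- 1 / (1 - r2_x1)

print(vif_from_cor)
print(vif_from_lm)  # agrees with vif_from_cor["x1"]
```

A value near 1 means a predictor is nearly uncorrelated with the others. Larger values, such as the 4.10 reported for Energy, are consistent with its strong correlations with dB and Acousticness in the matrix above.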

Part 1: Full Dataset Models

Linear Discriminant Analysis (LDA) on Full Dataset

## Time difference of 0.008840084 secs
##          
## lda.class   0   1
##         0 567 398
##         1 424 605
## [1] 0.5877633
## [1] 0.4122367
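
The two unlabelled numbers above are consistent with the accuracy and error rate of the confusion matrix. The minimal R sketch below recomputes them, together with sensitivity and specificity. It assumes that rows are predicted classes, columns are true classes, and class 1 is the positive class. The object names (`cm`, `accuracy`, and so on) are illustrative, not taken from the original code.

```r
# Confusion matrix copied from the LDA output above.
# Assumption: rows = predicted class, columns = true class.
cm <- matrix(c(567, 424, 398, 605), nrow = 2,
             dimnames = list(predicted = c("0", "1"), truth = c("0", "1")))

accuracy    <- sum(diag(cm)) / sum(cm)        # correct predictions / all predictions
error_rate  <- 1 - accuracy
sensitivity <- cm["1", "1"] / sum(cm[, "1"])  # true positives / all actual 1s
specificity <- cm["0", "0"] / sum(cm[, "0"])  # true negatives / all actual 0s

print(round(c(accuracy = accuracy, error = error_rate,
              sensitivity = sensitivity, specificity = specificity), 3))
```

Under these assumptions the accuracy is 1172/1994 ≈ 0.588, the sensitivity is 605/1003 ≈ 0.603, and the specificity is 567/991 ≈ 0.572.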

Quadratic Discriminant Analysis (QDA) on Full Dataset

## Time difference of 0.006765842 secs
##          
## qda.class   0   1
##         0 498 320
##         1 493 683
## [1] 0.5922768
## [1] 0.4077232

Logistic Regression on Full Dataset

## Time difference of 0.005219936 secs
##         
## glm.pred   0   1
##        0 567 398
##        1 424 605
## [1] 0.5877633
## [1] 0.4122367

K-Nearest Neighbors (KNN) on Full Dataset

## Time difference of 0.04465508 secs
##         
## knn.pred   0   1
##        0 687 352
##        1 304 651
## [1] 0.671013
## [1] 0.328987
## Time difference of 0.03402019 secs
##          
## knn.pred3   0   1
##         0 763 244
##         1 228 759
## [1] 0.7632899
## [1] 0.2367101
## Time difference of 0.0342679 secs
##          
## knn.pred5   0   1
##         0 718 309
##         1 273 694
## [1] 0.7081244
## [1] 0.2918756
## Time difference of 0.04007506 secs
##           
## knn.pred10   0   1
##          0 662 355
##          1 329 648
## [1] 0.6569709
## [1] 0.3430291

Part 2: Training and Test Split Models

Linear Discriminant Analysis (LDA) with Train/Test Split

## Time difference of 0.00903821 secs
##           
## lda.class2   0   1
##          0 266 207
##          1 227 297
## [1] 0.5646941
## [1] 0.4353059

Quadratic Discriminant Analysis (QDA) with Train/Test Split

## Time difference of 0.005118847 secs
##           
## qda.class2   0   1
##          0 197 151
##          1 296 353
## [1] 0.551655
## [1] 0.448345

Logistic Regression with Train/Test Split

## Time difference of 0.00689292 secs
##          
## glm.pred2   0   1
##         0 269 207
##         1 224 297
## [1] 0.5677031
## [1] 0.4322969

K-Nearest Neighbors (KNN) with Train/Test Split

## Time difference of 0.01145697 secs
##           
## knn.predT7   0   1
##          0 253 244
##          1 240 260
## [1] 0.5145436
## [1] 0.4854564
## Time difference of 0.01071215 secs
##           
## knn.predT3   0   1
##          0 245 261
##          1 248 243
## [1] 0.4894684
## [1] 0.5105316
## Time difference of 0.01069403 secs
##           
## knn.predT5   0   1
##          0 246 250
##          1 247 254
## [1] 0.5015045
## [1] 0.4984955
## Time difference of 0.0113771 secs
##            
## knn.predT10   0   1
##           0 257 243
##           1 236 261
## [1] 0.5195587
## [1] 0.4804413

Part 3: 5-Fold Cross-Validation

Logistic Regression with 5-Fold CV

## Time difference of 0.123796 secs
## [1] 0.5914787 0.5313283 0.5864662 0.5513784 0.5804020
## [1] 0.5682107
## [1] 0.01154756

##       TRUTH
## OUTPUT   0   1
##      0 544 414
##      1 447 589
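
The three unlabelled lines of the logistic regression CV output above are consistent with the per-fold accuracies, their mean, and the standard error of that mean. The source does not state this, so it is an interpretation. A minimal R sketch of the arithmetic, with illustrative variable names:

```r
# Per-fold accuracies copied from the logistic regression CV output above.
fold_acc <- c(0.5914787, 0.5313283, 0.5864662, 0.5513784, 0.5804020)

cv_mean <- mean(fold_acc)                         # average accuracy over the 5 folds
cv_se   <- sd(fold_acc) / sqrt(length(fold_acc))  # standard error of that average

print(cv_mean)
print(cv_se)
```

The pooled confusion matrix gives (544 + 589) / 1994 ≈ 0.5682. This is close to the fold mean but not identical to it, because 1994 observations cannot be split into five folds of equal size.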

LDA with 5-Fold CV

## Time difference of 0.0616281 secs
## [1] 0.5939850 0.5313283 0.5864662 0.5538847 0.5804020
## [1] 0.5692132
## [1] 0.0116334

##       TRUTH
## OUTPUT   0   1
##      0 541 409
##      1 450 594

QDA with 5-Fold CV

## Time difference of 0.05892515 secs
## [1] 0.5689223 0.5513784 0.5839599 0.5964912 0.5603015
## [1] 0.5722107
## [1] 0.00810621

##       TRUTH
## OUTPUT   0   1
##      0 492 354
##      1 499 649

KNN with 5-Fold CV

## Time difference of 0.04586816 secs
## [1] 0.5087719 0.5187970 0.5213033 0.5413534 0.5025126
## [1] 0.5185476
## [1] 0.006634929

##       TRUTH
## OUTPUT   0   1
##      0 531 500
##      1 460 503
## Time difference of 0.04767895 secs
## [1] 0.5288221 0.5363409 0.5187970 0.5463659 0.4899497
## [1] 0.5240551
## [1] 0.009649504

##       TRUTH
## OUTPUT   0   1
##      0 547 505
##      1 444 498
## Time difference of 0.04560113 secs
## [1] 0.5413534 0.5288221 0.5288221 0.5238095 0.5025126
## [1] 0.5250639
## [1] 0.006339286

##       TRUTH
## OUTPUT   0   1
##      0 547 503
##      1 444 500
## Time difference of 0.05186296 secs
## [1] 0.5488722 0.5162907 0.5037594 0.5187970 0.5050251
## [1] 0.5185489
## [1] 0.008143353

##       TRUTH
## OUTPUT   0   1
##      0 542 511
##      1 449 492

Results Summary

Full Dataset Results

##               LDA   QDA Logistic Regression KNN k=3 KNN k=5 KNN k=7 KNN k=10
## Accuracy    0.588 0.581               0.588   0.769   0.718   0.682    0.663
## Sensitivity 0.603 0.681               0.603   0.757   0.692   0.649    0.646
## Specificity 0.572 0.503               0.572   0.770   0.725   0.693    0.668

Train/Test Split Results

##               LDA   QDA Logistic Regression KNN k=3 KNN k=5 KNN k=7 KNN k=10
## Accuracy    0.567 0.544               0.495   0.507   0.520   0.525    0.490
## Sensitivity 0.589 0.700               0.482   0.504   0.516   0.518    0.512
## Specificity 0.499 0.400               0.497   0.499   0.513   0.521    0.454

5-Fold Cross-Validation Results

##               LDA   QDA Logistic Regression KNN k=3 KNN k=5 KNN k=7 KNN k=10
## Accuracy    0.569 0.565               0.569   0.525   0.529   0.528    0.528
## Sensitivity 0.592 0.647               0.587   0.504   0.500   0.500    0.498
## Specificity 0.546 0.496               0.549   0.538   0.549   0.547    0.550

Comparative Boxplot of 5-Fold CV Results
